# OBR Export

Data on OBR publications, forecasts, and media salience, scraped and compiled from the Office for Budget Responsibility (OBR) and Times websites.

For any questions please contact Finn McEvoy (f.l.mcevoy@lse.ac.uk).

## Files

### `obr_publications.csv`
A catalogue of downloadable files from the OBR website. Each row is one document.

| Column | Description |
|---|---|
| `title` | Document title |
| `url` | Direct download link |
| `publication_date` | Raw date string from the site |
| `publication_date_parsed` | Parsed date (YYYY-MM-DD) |
| `file_size` | Human-readable file size |
| `file_size_bytes` | File size in bytes |
| `file_type` | Format (pdf, xlsx, zip, etc.) |
| `meta_raw` | Raw metadata string from the page |
| `page_number` | Pagination page the entry was found on |

Coverage runs from the OBR's creation in 2010 through September 2025. The catalogue includes Economic and Fiscal Outlooks, Fiscal Risks and Sustainability reports, monthly public finances commentaries, devolved forecasts, supplementary data, and administrative logs; not all entries are fiscally significant.
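
Because the catalogue mixes major reports with minor files, a common first step is to narrow it by type and date. The sketch below shows one way to do that with the standard library. The sample rows are hypothetical placeholders rather than real catalogue entries, the helper names `load_rows` and `filter_publications` are illustrative, and it assumes `publication_date_parsed` is either empty or in the documented YYYY-MM-DD form.

```python
import csv
import io
from datetime import date

# Hypothetical sample rows for illustration only; not real catalogue entries.
SAMPLE_CSV = """title,url,publication_date_parsed,file_type,file_size_bytes
Example outlook,https://example.org/a.pdf,2024-03-06,pdf,1200000
Example tables,https://example.org/b.xlsx,2024-03-06,xlsx,350000
Example commentary,https://example.org/c.pdf,2019-07-19,pdf,80000
"""


def load_rows(handle):
    """Read a CSV file object into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle))


def filter_publications(rows, file_type, start, end):
    """Keep rows of one file type published within [start, end] inclusive."""
    kept = []
    for row in rows:
        raw = row["publication_date_parsed"]
        if not raw:
            continue  # assumption: an empty value means the date was not parsed
        published = date.fromisoformat(raw)
        if row["file_type"].lower() == file_type and start <= published <= end:
            kept.append(row)
    return kept


rows = load_rows(io.StringIO(SAMPLE_CSV))
pdfs_2024 = filter_publications(rows, "pdf", date(2024, 1, 1), date(2024, 12, 31))
print([r["title"] for r in pdfs_2024])
```

To run this against the real file, pass an open handle instead of the sample string, for example `open("obr_publications.csv", newline="", encoding="utf-8")`; the encoding is an assumption and may need adjusting.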

---

### `psnb_revisions.csv`
OBR forecast-to-forecast revisions to Public Sector Net Borrowing (PSNB), derived from the Historical Official Forecasts Database.

| Column | Description |
|---|---|
| `date_of_forecast` | Date of the current forecast |
| `date_of_prev_forecast` | Date of the prior forecast being compared against |
| `psnb_revision_sum` | Sum of PSNB revisions across the forecast horizon (£bn) |
| `psnb_revision_pct_mean` | Mean percentage revision across the horizon |

Coverage: June 2010 – March 2025 (31 forecasts). Large positive values indicate upward revisions to borrowing (e.g. the Covid shock in 2020–21 and the aftermath of the 2022 mini-Budget); large negative values indicate downward revisions, i.e. an improved borrowing outlook (e.g. 2013).
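
The sign convention above can be applied directly when ranking forecasts by the size of their revision. The following is a minimal sketch using the standard library. The sample values and forecast labels are hypothetical, not figures from the dataset, and `rank_by_revision` and `direction` are illustrative helper names.

```python
import csv
import io

# Hypothetical values for illustration only; not figures from the dataset.
SAMPLE_CSV = """date_of_forecast,date_of_prev_forecast,psnb_revision_sum,psnb_revision_pct_mean
forecast_b,forecast_a,12.5,4.0
forecast_c,forecast_b,-30.0,-9.5
forecast_d,forecast_c,210.0,55.0
"""


def rank_by_revision(rows, n=2):
    """Return the n rows with the largest absolute summed PSNB revision."""
    ranked = sorted(
        rows,
        key=lambda r: abs(float(r["psnb_revision_sum"])),
        reverse=True,
    )
    return ranked[:n]


def direction(row):
    """Label a row by the sign of its summed revision (positive = more borrowing)."""
    value = float(row["psnb_revision_sum"])
    if value > 0:
        return "higher borrowing"
    if value < 0:
        return "lower borrowing"
    return "unchanged"


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
top = rank_by_revision(rows, n=2)
for row in top:
    print(row["date_of_forecast"], row["psnb_revision_sum"], direction(row))
```

Ranking on the absolute value surfaces both large deteriorations and large improvements; sort on the signed value instead if only one direction is of interest.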

---

### `Times_OBR_Articles.csv`
Partial-text archive of Times/Sunday Times articles mentioning the OBR.

| Column | Description |
|---|---|
| `id` | Article UUID |
| `headline` / `short_headline` | Full and shortened article title |
| `url` | Article URL |
| `publication` | `times` or `sundaytimes` |
| `section` | Editorial section |
| `published_datetime` / `published_date` | Publication timestamp / publication date |
| `authors` | Author list |
| `content_clean` | Plain-text body |
| `content_html` | HTML body |
| `keywords` / `topics` / `tags` | Taxonomy fields |
| `word_count` | Article word count |
| `image_*` | Lead image metadata |
| `has_video`, `comments_enabled`, `flags` | Content flags |
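
One simple proxy for media salience is the number of articles per month. The sketch below counts rows by month using the standard library. The sample rows are hypothetical, `monthly_counts` is an illustrative helper name, and it assumes `published_date` is stored as YYYY-MM-DD, which this README does not confirm.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows for illustration only; not real articles.
SAMPLE_CSV = """id,publication,published_date,word_count
a1,times,2024-03-06,850
a2,sundaytimes,2024-03-10,1200
a3,times,2024-04-02,640
"""


def monthly_counts(rows):
    """Count articles per calendar month, keyed as 'YYYY-MM'."""
    counts = Counter()
    for row in rows:
        # Assumption: published_date is an ISO date, so the first 7 chars are YYYY-MM.
        month = row["published_date"][:7]
        counts[month] += 1
    return dict(sorted(counts.items()))


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
monthly = monthly_counts(rows)
print(monthly)
```

When reading the real file, open it with `newline=""` so quoted line breaks inside `content_clean` or `content_html` are parsed correctly. If very long body fields trigger a field-size error, `csv.field_size_limit` can be used to raise the cap.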

